discharge letter
Interpretable phenotyping of Heart Failure patients with Dutch discharge letters
Torri, Vittorio, Boonstra, Machteld J., van de Veerdonk, Marielle C., Kalkman, Deborah N., Uijl, Alicia, Ieva, Francesca, Abu-Hanna, Ameen, Asselbergs, Folkert W., Calixto, Iacer
Objective: Heart failure (HF) patients present with diverse phenotypes affecting treatment and prognosis. This study evaluates models for phenotyping HF patients based on left ventricular ejection fraction (LVEF) classes, using structured and unstructured data, assessing performance and interpretability. Materials and Methods: The study analyzes all HF hospitalizations at both Amsterdam UMC hospitals (AMC and VUmc) from 2015 to 2023 (33,105 hospitalizations, 16,334 patients). Data from AMC were used for model training, and from VUmc for external validation. The dataset was unlabelled and included tabular clinical measurements and discharge letters. Silver labels for LVEF classes were generated by combining diagnosis codes, echocardiography results, and textual mentions. Gold labels were manually annotated for 300 patients for testing. Multiple Transformer-based (black-box) and Aug-Linear (white-box) models were trained and compared with baselines on structured and unstructured data. To evaluate interpretability, two clinicians annotated 20 discharge letters by highlighting information they considered relevant for LVEF classification. These were compared to SHAP and LIME explanations from black-box models and the inherent explanations of Aug-Linear models. Results: BERT-based and Aug-Linear models, using discharge letters alone, achieved the highest classification results (AUC=0.84 for BERT, 0.81 for Aug-Linear on external validation), outperforming baselines. Aug-Linear explanations aligned more closely with clinicians' explanations than post-hoc explanations on black-box models. Conclusions: Discharge letters emerged as the most informative source for phenotyping HF patients. Aug-Linear models matched black-box performance while providing clinician-aligned interpretability, supporting their use in transparent clinical decision-making.
- Europe > Netherlands > North Holland > Amsterdam (0.26)
- North America > United States (0.14)
- Europe > Italy > Lombardy > Milan (0.04)
- (4 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.68)
MeDiSumQA: Patient-Oriented Question-Answer Generation from Discharge Letters
Dada, Amin, Koras, Osman Alperen, Bauer, Marie, Butler, Amanda, Smith, Kaleb E., Kleesiek, Jens, Friedrich, Julian
While increasing patients' access to medical documents improves medical care, this benefit is limited by varying health literacy levels and complex medical terminology. Large language models (LLMs) offer solutions by simplifying medical information. However, evaluating LLMs for safe and patient-friendly text generation is difficult due to the lack of standardized evaluation resources. To fill this gap, we developed MeDiSumQA. MeDiSumQA is a dataset created from MIMIC-IV discharge summaries through an automated pipeline combining LLM-based question-answer generation with manual quality checks. We use this dataset to evaluate various LLMs on patient-oriented question-answering. Our findings reveal that general-purpose LLMs frequently surpass biomedical-adapted models, while automated metrics correlate with human judgment. By releasing MeDiSumQA on PhysioNet, we aim to advance the development of LLMs to enhance patient understanding and ultimately improve care outcomes.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Italy > Tuscany > Florence (0.04)
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
- (6 more...)
- Health & Medicine > Diagnostic Medicine (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
- Health & Medicine > Health Care Technology > Medical Record (0.46)
- Health & Medicine > Therapeutic Area > Oncology (0.46)
Weakly-supervised diagnosis identification from Italian discharge letters
Torri, Vittorio, Barbieri, Elisa, Cantarutti, Anna, Giaquinto, Carlo, Ieva, Francesca
Objective: Recognizing diseases from discharge letters is crucial for cohort selection and epidemiological analyses, as this is the only type of data consistently produced across hospitals. This is a classic document classification problem, typically requiring supervised learning. However, manual annotation of large datasets of discharge letters is uncommon since it is extremely time-consuming. We propose a novel weakly-supervised pipeline to recognize diseases from Italian discharge letters. Methods: Our Natural Language Processing pipeline is based on a fine-tuned version of the Italian Umberto model. The pipeline extracts diagnosis-related sentences from a subset of letters and applies a two-level clustering using the embeddings generated by the fine-tuned Umberto model. These clusters are summarized and those mapped to the diseases of interest are selected as weak labels. Finally, the same BERT-based model is trained using these weak labels to detect the targeted diseases. Results: A case study related to the identification of bronchiolitis with 33'176 Italian discharge letters from 44 hospitals in the Veneto Region shows the potential of our method, with an AUC of 77.7 % and an F1-Score of 75.1 % on manually annotated labels, improving compared to other non-supervised methods and with a limited loss compared to fully supervised methods. Results are robust to the cluster selection and the identified clusters highlight the potential to recognize a variety of diseases. Conclusions: This study demonstrates the feasibility of diagnosis identification from Italian discharge letters in the absence of labelled data. Our pipeline showed strong performance and robustness, and its flexibility allows for easy adaptation to various diseases. This approach offers a scalable solution for clinical text classification, reducing the need for manual annotation while maintaining good accuracy.
- Europe > Italy > Veneto (0.24)
- Europe > Italy > Lombardy > Milan (0.04)
- North America > United States (0.04)
- North America > Canada > Alberta (0.04)
AI Managed Emergency Documentation with a Pretrained Model
Menzies, David, Kirwan, Sean, Albarqawi, Ahmad
This study investigates the use of a large language model system to improve efficiency and quality in emergency department (ED) discharge letter writing. Time constraints and infrastructural deficits make compliance with current discharge letter targets difficult. We explored potential efficiencies from an artificial intelligence software in the generation of ED discharge letters and the attitudes of doctors toward this technology. The evaluated system leverages advanced techniques to fine-tune a model to generate discharge summaries from short-hand inputs, including voice, text, and electronic health record data. Nineteen physicians with emergency medicine experience evaluated the system text and voice-to-text interfaces against manual typing. The results showed significant time savings with MedWrite LLM interfaces compared to manual methods.
- Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- North America > United States > Illinois > Champaign County > Champaign (0.04)
- Research Report > Experimental Study (0.94)
- Research Report > New Finding (0.87)
QUB-Cirdan at "Discharge Me!": Zero shot discharge letter generation by open-source LLM
Guo, Rui, Farnan, Greg, McLaughlin, Niall, Devereux, Barry
The BioNLP ACL'24 Shared Task on Streamlining Discharge Documentation aims to reduce the administrative burden on clinicians by automating the creation of critical sections of patient discharge letters. This paper presents our approach using the Llama3 8B quantized model to generate the "Brief Hospital Course" and "Discharge Instructions" sections. We employ a zero-shot method combined with Retrieval-Augmented Generation (RAG) to produce concise, contextually accurate summaries. Our contributions include the development of a curated template-based approach to ensure reliability and consistency, as well as the integration of RAG for word count prediction. We also describe several unsuccessful experiments to provide insights into our pathway for the competition. Our results demonstrate the effectiveness and efficiency of our approach, achieving high scores across multiple evaluation metrics.
CLUE: A Clinical Language Understanding Evaluation for LLMs
Dada, Amin, Bauer, Marie, Contreras, Amanda Butler, Koraş, Osman Alperen, Seibold, Constantin Marc, Smith, Kaleb E, Kleesiek, Jens
Large Language Models (LLMs) are expected to significantly contribute to patient care, diagnostics, and administrative processes. Emerging biomedical LLMs aim to address healthcare-specific challenges, including privacy demands and computational constraints. Assessing the models' suitability for this sensitive application area is of the utmost importance. However, evaluation has primarily been limited to non-clinical tasks, which do not reflect the complexity of practical clinical applications. To fill this gap, we present the Clinical Language Understanding Evaluation (CLUE), a benchmark tailored to evaluate LLMs on clinical tasks. CLUE includes six tasks to test the practical applicability of LLMs in complex healthcare settings. Our evaluation includes a total of $25$ LLMs. In contrast to previous evaluations, CLUE shows a decrease in performance for nine out of twelve biomedical models. Our benchmark represents a step towards a standardized approach to evaluating and developing LLMs in healthcare to align future model development with the real-world needs of clinical application. We open-source all evaluation scripts and datasets for future research at https://github.com/TIO-IKIM/CLUE.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Italy > Tuscany > Florence (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- (5 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)